Results 1 - 20 of 75
1.
Article in English | MEDLINE | ID: mdl-38656847

ABSTRACT

This article aims to solve the video object segmentation (VOS) task in a scribble-supervised manner, in which VOS models are not only initialized with sparse target scribbles for inference but also trained by sparse scribble annotations. Thus, the annotation burdens for both initialization and training can be substantially lightened. The difficulties of scribble-supervised VOS lie in two aspects: 1) it demands a strong reasoning ability to carefully segment the target given only a sparse initial target scribble and 2) it necessitates learning dense prediction from sparse scribble annotations during training, requiring powerful learning capability. In this work, we propose a reliability-guided hierarchical memory network (RHMNet) for this task, which segments the target in a stepwise expanding strategy w.r.t. the memory reliability level. To be specific, RHMNet maintains a reliability-guided memory bank. It first uses the high-reliability memory to locate the region with high reliability belonging to the target, i.e., highly similar to the initial target scribble. Then, it expands the located high-reliability region to the entire target conditioned on the region itself and all existing memories. In addition, we propose a scribble-supervised learning mechanism to facilitate the model learning for dense prediction. It exploits the pixel-level relations within a single frame and the instance-level variations across multiple frames to take full advantage of the scribble annotations in sequence training samples. The favorable performance on four popular benchmarks demonstrates that our method is promising. Our project is available at: https://github.com/mkg1204/RHMNet-for-SSVOS.

2.
Science ; 383(6689): eadj4591, 2024 Mar 22.
Article in English | MEDLINE | ID: mdl-38513023

ABSTRACT

Brassinosteroids are steroidal phytohormones that regulate plant development and physiology, including adaptation to environmental stresses. Brassinosteroids are synthesized in the cell interior but bind receptors at the cell surface, necessitating a yet to be identified export mechanism. Here, we show that a member of the ATP-binding cassette (ABC) transporter superfamily, ABCB19, functions as a brassinosteroid exporter. We present its structure in both the substrate-unbound and the brassinosteroid-bound states. Bioactive brassinosteroids are potent activators of ABCB19 ATP hydrolysis activity, and transport assays showed that ABCB19 transports brassinosteroids. In Arabidopsis thaliana, ABCB19 and its close homolog, ABCB1, positively regulate brassinosteroid responses. Our results uncover an elusive export mechanism for bioactive brassinosteroids that is tightly coordinated with brassinosteroid signaling.


Subjects
ATP-Binding Cassette Transporters; Arabidopsis Proteins; Arabidopsis; Brassinosteroids; Adenosine Triphosphate/metabolism; Arabidopsis/genetics; Arabidopsis/metabolism; Arabidopsis Proteins/chemistry; Arabidopsis Proteins/genetics; Arabidopsis Proteins/metabolism; ATP-Binding Cassette Transporters/chemistry; ATP-Binding Cassette Transporters/genetics; ATP-Binding Cassette Transporters/metabolism; Brassinosteroids/metabolism; Indoleacetic Acids/metabolism; Protein Conformation
3.
Article in English | MEDLINE | ID: mdl-38421845

ABSTRACT

Natural Language Generation (NLG) accepts input data in the form of images, videos, or text and generates corresponding natural language text as output. Existing NLG methods mainly adopt a supervised approach and rely heavily on coupled data-to-text pairs. However, for many targeted scenarios and for non-English languages, sufficient quantities of labeled data are often not available. As a result, it is necessary to collect and label data-text pairs for training, which is both costly and time-consuming. To relax the dependency on labeled data of downstream tasks, we propose an intuitive and effective zero-shot learning framework, ZeroNLG, which can deal with multiple NLG tasks, including image-to-text (image captioning), video-to-text (video captioning), and text-to-text (neural machine translation), across English, Chinese, German, and French within a unified framework. ZeroNLG does not require any labeled downstream pairs for training. During training, ZeroNLG (i) projects different domains (across modalities and languages) to corresponding coordinates in a shared common latent space; (ii) bridges different domains by aligning their corresponding coordinates in this space; and (iii) builds an unsupervised multilingual auto-encoder to learn to generate text by reconstructing the input text given its coordinate in shared latent space. Consequently, during inference, based on the data-to-text pipeline, ZeroNLG can generate target sentences across different languages given the coordinate of input data in the common space. Within this unified framework, given visual (imaging or video) data as input, ZeroNLG can perform zero-shot visual captioning; given textual sentences as input, ZeroNLG can perform zero-shot machine translation. 
We present the results of extensive experiments on twelve NLG tasks, showing that, without using any labeled downstream pairs for training, ZeroNLG generates high-quality and "believable" outputs and significantly outperforms existing zero-shot methods. Our code and data are available at https://github.com/yangbang18/ZeroNLG.

4.
Can J Infect Dis Med Microbiol ; 2024: 6698387, 2024.
Article in English | MEDLINE | ID: mdl-38361762

ABSTRACT

This study aimed to evaluate the prevalence and quality of antimicrobial prescriptions using a Global Point Prevalence Survey (PPS) tool, to help identify targets for improving antimicrobial prescribing, and to inform the development of antimicrobial stewardship activities. Antimicrobial prescriptions for inpatients staying at a hospital overnight were surveyed on one weekday in October 2018, November 2019, and November 2020. Data including basic patient information, antimicrobial drugs, quality evaluation of antimicrobial drug prescription, and the risk factors of nosocomial infection were collected from the doctor network workstation. Patient information was anonymized and entered in the PPS Web application by physicians. A total of 720 patients (median age, 62 years) were surveyed. Of them, 246 (34.2%) were prescribed antimicrobials on the survey days. Hospital-wide antimicrobial use showed a significantly decreasing trend (P < 0.001). The most commonly prescribed antimicrobial drugs were third-generation cephalosporins (40.5%), followed by quinolones (21.8%) and second-generation cephalosporins (12.5%). In our study, cefoperazone/sulbactam, ceftazidime, and levofloxacin were the most commonly used antimicrobials. The most common indication for antimicrobial use was pneumonia or lower respiratory tract infection (159/321, 49.5%). Antimicrobials for surgical prophylaxis represented 16.2% of the total antibiotic doses. Of those, 67.3% were administered for more than 24 h. The rate of adherence to antibiotic guidelines was 61.4%. The indications for antimicrobials were not documented in 54.5% of the prescriptions. A stop/review date was documented for 36.8% of prescriptions. The PPS tool is useful in identifying targets to enhance the quality of antimicrobial prescriptions and to improve the adherence rate in hospitals. This survey can serve as a baseline for assessing the quality of antimicrobial use after antimicrobial interventions are routinely applied.

5.
IEEE Trans Image Process ; 33: 1059-1069, 2024.
Article in English | MEDLINE | ID: mdl-38265894

ABSTRACT

This paper presents a novel fine-grained task for traffic accident analysis. Accident detection in surveillance or dashcam videos is a common task in the field of traffic accident analysis by using videos. However, common accident detection does not analyze the specific particulars of the accident, only identifies the accident's existence or occurrence time in a video. In this paper, we define the novel fine-grained accident detection task which contains fine-grained accident classification, temporal-spatial occurrence region localization, and accident severity estimation. A transformer-based framework combining the RGB and optical flow information of videos is proposed for fine-grained accident detection. Additionally, we introduce a challenging Fine-grained Accident Detection (FAD) database that covers multiple tasks in surveillance videos which places more emphasis on the overall perspective. Experimental results demonstrate that our model could effectively extract the video features for multiple tasks, indicating that current traffic accident analysis has limitations in dealing with the FAD task and that further research is indeed needed.

6.
Article in English | MEDLINE | ID: mdl-38241099

ABSTRACT

Multidomain crowd counting aims to learn a general model for multiple diverse datasets. However, deep networks prefer modeling distributions of the dominant domains instead of all domains, which is known as domain bias. In this study, we propose a simple-yet-effective modulating domain-specific knowledge network (MDKNet) to handle the domain bias issue in multidomain crowd counting. MDKNet is achieved by employing the idea of "modulating", enabling a deep network to balance and model the different distributions of diverse datasets with little bias. Specifically, we propose an instance-specific batch normalization (IsBN) module, which serves as a base modulator to refine the information flow to be adaptive to domain distributions. To precisely modulate the domain-specific information, the domain-guided virtual classifier (DVC) is then introduced to learn a domain-separable latent space. This space is employed as an input guidance for the IsBN modulator, such that the mixture distributions of multiple datasets can be well treated. Extensive experiments performed on popular benchmarks, including Shanghai-tech A/B, QNRF, and NWPU, validate the superiority of MDKNet in tackling multidomain crowd counting and its effectiveness for multidomain learning. Code is available at https://github.com/csguomy/MDKNet.

7.
IEEE Trans Cybern ; 54(3): 1997-2010, 2024 Mar.
Article in English | MEDLINE | ID: mdl-37824314

ABSTRACT

Different from visible cameras which record intensity images frame by frame, the biologically inspired event camera produces a stream of asynchronous and sparse events with much lower latency. In practice, visible cameras can better perceive texture details and slow motion, while event cameras can be free from motion blurs and have a larger dynamic range which enables them to work well under fast motion and low illumination (LI). Therefore, the two sensors can cooperate with each other to achieve more reliable object tracking. In this work, we propose a large-scale Visible-Event benchmark (termed VisEvent) due to the lack of a realistic and scaled dataset for this task. Our dataset consists of 820 video pairs captured under LI, high speed, and background clutter scenarios, and it is divided into a training and a testing subset, each of which contains 500 and 320 videos, respectively. Based on VisEvent, we transform the event flows into event images and construct more than 30 baseline methods by extending current single-modality trackers into dual-modality versions. More importantly, we further build a simple but effective tracking algorithm by proposing a cross-modality transformer, to achieve more effective feature fusion between visible and event data. Extensive experiments on the proposed VisEvent dataset, FE108, COESOT, and two simulated datasets (i.e., OTB-DVS and VOT-DVS), validated the effectiveness of our model. The dataset and source code have been released on: https://github.com/wangxiao5791509/VisEvent_SOT_Benchmark.
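
The abstract above mentions transforming event flows into event images but does not say how. A minimal Python sketch of one common approach, accumulating signed event counts per pixel over a time window, is given here for orientation only. The `(x, y, timestamp, polarity)` tuple layout and the function name `events_to_image` are hypothetical and are not taken from the VisEvent code.

```python
def events_to_image(events, width, height):
    """Accumulate events into a 2D image of signed counts.

    Each event is assumed to be a tuple (x, y, timestamp, polarity),
    with polarity > 0 for a brightness increase and <= 0 for a decrease.
    This layout is an illustrative assumption, not the dataset's format.
    """
    image = [[0] * width for _ in range(height)]
    for x, y, _timestamp, polarity in events:
        # Ignore events that fall outside the sensor area.
        if 0 <= x < width and 0 <= y < height:
            image[y][x] += 1 if polarity > 0 else -1
    return image


# Usage: four events on a 2x1 sensor, one of them out of bounds.
sample_events = [(0, 0, 0.0, 1), (0, 0, 0.1, 1), (1, 0, 0.2, -1), (5, 5, 0.3, 1)]
sample_image = events_to_image(sample_events, width=2, height=1)
```

An image built this way can be passed to a frame-based tracker alongside the visible frame, which is the kind of dual-modality extension the abstract describes.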

8.
Article in English | MEDLINE | ID: mdl-37796672

ABSTRACT

Unpaired medical image enhancement (UMIE) aims to transform a low-quality (LQ) medical image into a high-quality (HQ) one without relying on paired images for training. While most existing approaches are based on Pix2Pix/CycleGAN and are effective to some extent, they fail to explicitly use HQ information to guide the enhancement process, which can lead to undesired artifacts and structural distortions. In this article, we propose a novel UMIE approach that avoids the above limitation of existing methods by directly encoding HQ cues into the LQ enhancement process in a variational fashion and thus model the UMIE task under the joint distribution between the LQ and HQ domains. Specifically, we extract features from an HQ image and explicitly insert the features, which are expected to encode HQ cues, into the enhancement network to guide the LQ enhancement with the variational normalization module. We train the enhancement network adversarially with a discriminator to ensure the generated HQ image falls into the HQ domain. We further propose a content-aware loss to guide the enhancement process with wavelet-based pixel-level and multiencoder-based feature-level constraints. Additionally, as a key motivation for performing image enhancement is to make the enhanced images serve better for downstream tasks, we propose a bi-level learning scheme to optimize the UMIE task and downstream tasks cooperatively, helping generate HQ images both visually appealing and favorable for downstream tasks. Experiments on three medical datasets verify that our method outperforms existing techniques in terms of both enhancement quality and downstream task performance. The code and the newly collected datasets are publicly available at https://github.com/ChunmingHe/HQG-Net.

9.
JACS Au ; 3(8): 2166-2173, 2023 Aug 28.
Article in English | MEDLINE | ID: mdl-37654585

ABSTRACT

Numerous chemical transformations require two or more catalytically active sites that act in a concerted manner; nevertheless, designing heterogeneous catalysts with such multiple functionalities remains an overwhelming challenge. Herein, it is shown that by the integration of acidic flexible polymers and Pd-metallated covalent organic framework (COF) hosts, the merits of both catalytically active sites can be utilized to realize heterogeneous synergistic catalysis that is active in the conversion of nitrobenzenes to carbamates via reductive carbonylation. The concentrated catalytically active species in the nanospace force the two catalytic components into proximity, thereby enhancing the cooperativity between the acidic species and Pd species to facilitate synergistic catalysis. The resulting host-guest assemblies constitute more efficient systems than the corresponding physical mixtures and the homogeneous counterparts. Furthermore, this system enables easy access to a family of important derivatives such as herbicides and polyurethane monomers and can be integrated with other COFs, showing promising results. This study utilizes host-guest assembly as a versatile tool for the fabrication of multifunctional catalysts with enhanced cooperativity between different catalytic species.

10.
J Med Ultrasound ; 31(2): 92-100, 2023.
Article in English | MEDLINE | ID: mdl-37576422

ABSTRACT

Contrast-enhanced ultrasound (CEUS) uses an intravascular contrast agent to enhance blood flow signals and assess microcirculation in different parts of the human body. Over the past decade, CEUS has become more widely applied in musculoskeletal (MSK) medicine, and the current review aims to systematically summarize current research on the application of CEUS in the MSK field, focusing on 67 articles published between January 2001 and June 2021 in online databases including PubMed, Scopus, and Embase. CEUS has been widely used for the clinical assessment of muscle microcirculation, tendinopathy, fracture nonunions, sports-related injuries, arthritis, peripheral nerves, and tumors, and can serve as an objective and quantitative evaluation tool for prognosis and outcome prediction. Optimal CEUS parameters and diagnostic cut off values for each disease category remain to be confirmed.

11.
Article in English | MEDLINE | ID: mdl-37624720

ABSTRACT

In person re-identification (re-ID), extracting part-level features from person images has been verified to be crucial to offer fine-grained information. Most of the existing CNN-based methods only locate the human parts coarsely, or rely on pretrained human parsing models and fail in locating the identifiable nonhuman parts (e.g., knapsack). In this article, we introduce an alignment scheme in transformer architecture for the first time and propose the auto-aligned transformer (AAformer) to automatically locate both the human parts and nonhuman ones at patch level. We introduce the "Part tokens (PARTs)", which are learnable vectors, to extract part features in the transformer. A PART only interacts with a local subset of patches in self-attention and learns to be the part representation. To adaptively group the image patches into different subsets, we design the auto-alignment. Auto-alignment employs a fast variant of optimal transport (OT) algorithm to online cluster the patch embeddings into several groups with the PARTs as their prototypes. AAformer integrates the part alignment into the self-attention and the output PARTs can be directly used as part features for retrieval. Extensive experiments validate the effectiveness of PARTs and the superiority of AAformer over various state-of-the-art methods.

12.
IEEE Trans Pattern Anal Mach Intell ; 45(11): 13117-13133, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37390000

ABSTRACT

Our goal in this research is to study a more realistic environment in which we can conduct weakly-supervised multi-modal instance-level product retrieval for fine-grained product categories. We first contribute the Product1M datasets and define two practical instance-level retrieval tasks that enable evaluations on price comparison and personalized recommendations. For both instance-level tasks, accurately identifying the intended product target mentioned in visual-linguistic data and mitigating the impact of irrelevant content are quite challenging. To address this, we devise a more effective cross-modal pretraining model capable of adaptively incorporating key concept information from multi-modal data. This is accomplished by utilizing an entity graph, where nodes represent entities and edges denote the similarity relations between them. Specifically, a novel Entity-Graph Enhanced Cross-Modal Pretraining (EGE-CMP) model is proposed for instance-level commodity retrieval, which explicitly injects entity knowledge in both node-based and subgraph-based ways into the multi-modal networks via a self-supervised hybrid-stream transformer. This could reduce the confusion between different object contents, thereby effectively guiding the network to focus on entities with real semantics. Experimental results verify the efficacy and generalizability of our EGE-CMP, which outperforms several SOTA cross-modal baselines such as CLIP (Radford et al. 2021), UNITER (Chen et al. 2020), and CAPTURE (Zhan et al. 2021).

13.
Nat Chem Biol ; 19(11): 1331-1341, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37365405

ABSTRACT

Brassinosteroids (BRs) are steroidal phytohormones that are essential for plant growth, development and adaptation to environmental stresses. BRs act in a dose-dependent manner and do not travel over long distances; hence, BR homeostasis maintenance is critical for their function. Biosynthesis of bioactive BRs relies on the cell-to-cell movement of hormone precursors. However, the mechanism of the short-distance BR transport is unknown, and its contribution to the control of endogenous BR levels remains unexplored. Here we demonstrate that plasmodesmata (PD) mediate the passage of BRs between neighboring cells. Intracellular BR content, in turn, is capable of modulating PD permeability to optimize its own mobility, thereby manipulating BR biosynthesis and signaling. Our work uncovers a thus far unknown mode of steroid transport in eukaryotes and exposes an additional layer of BR homeostasis regulation in plants.


Subjects
Arabidopsis Proteins; Brassinosteroids; Plasmodesmata/metabolism; Plant Growth Regulators; Plants/metabolism; Hormones; Gene Expression Regulation, Plant; Arabidopsis Proteins/metabolism
14.
Front Immunol ; 14: 1158457, 2023.
Article in English | MEDLINE | ID: mdl-37122735

ABSTRACT

Introduction: Dysregulated inflammation and coagulation are underlying mechanisms driving organ injury after trauma and hemorrhagic shock. Heparan sulfates, cell surface glycosaminoglycans abundantly expressed on the endothelial surface, regulate a variety of cellular processes. Endothelial heparan sulfate containing a rare 3-O-sulfate modification on a glucosamine residue is anticoagulant and anti-inflammatory through high-affinity antithrombin binding and sequestering of circulating damage-associated molecular pattern molecules. Our goal was to evaluate therapeutic potential of a synthetic 3-O-sulfated heparan sulfate dodecasaccharide (12-mer, or dekaparin) to attenuate thromboinflammation and prevent organ injury. Methods: Male Sprague-Dawley rats were pre-treated subcutaneously with vehicle (saline) or dekaparin (2 mg/kg) and subjected to a trauma/hemorrhagic shock model through laparotomy, gut distention, and fixed-pressure hemorrhage. Vehicle and dekaparin-treated rats were resuscitated with Lactated Ringer's solution (LR) and compared to vehicle-treated fresh-frozen-plasma-(FFP)-resuscitated rats. Serial blood samples were collected at baseline, after induction of shock, and 3 hours after fluid resuscitation to measure hemodynamic and metabolic shock indicators, inflammatory mediators, and thrombin-antithrombin complex formation. Lungs and kidneys were processed for organ injury scoring and immunohistochemical analysis to quantify presence of neutrophils. Results: Induction of trauma and hemorrhagic shock resulted in significant increases in thrombin-antithrombin complex, inflammatory markers, and lung and kidney injury scores. Compared to vehicle, dekaparin treatment did not affect induction, severity, or recovery of shock as indicated by hemodynamics, metabolic indicators of shock (lactate and base excess), or metrics of bleeding, including overall blood loss, resuscitation volume, or hematocrit. 
While LR-vehicle-resuscitated rodents exhibited increased lung and kidney injury, administration of dekaparin significantly reduced organ injury scores and was similar to organ protection conferred by FFP resuscitation. This was associated with a significant reduction in neutrophil infiltration in lungs and kidneys and reduced lung fibrin deposition among dekaparin-treated rats compared to vehicle. No differences in organ injury, neutrophil infiltrates, or fibrin staining between dekaparin and FFP groups were observed. Finally, dekaparin treatment attenuated induction of thrombin-antithrombin complex and inflammatory mediators in plasma following trauma and hemorrhagic shock. Conclusion: Anti-thromboinflammatory properties of a synthetic 3-O-sulfated heparan sulfate 12-mer, dekaparin, could provide therapeutic benefit for mitigating organ injury following major trauma and hemorrhagic shock.


Subjects
Shock, Hemorrhagic; Thrombosis; Rats; Male; Animals; Rats, Sprague-Dawley; Shock, Hemorrhagic/complications; Shock, Hemorrhagic/drug therapy; Thromboinflammation; Inflammation/drug therapy; Inflammation/complications; Sulfates/therapeutic use; Thrombosis/complications; Heparitin Sulfate; Fibrin
15.
IEEE Trans Image Process ; 32: 2947-2959, 2023.
Article in English | MEDLINE | ID: mdl-37195843

ABSTRACT

Measuring the similarity of two images is of crucial importance in computer vision. Class agnostic common object detection is a nascent research topic about mining image similarity, which aims to detect common object pairs from two images without category information. This task is general and less restrictive which explores the similarity between objects and can further describe the commonality of image pairs at the object level. However, previous works suffer from features with low discrimination caused by the lack of category information. Moreover, most existing methods compare objects extracted from two images in a simple and direct way, ignoring the internal relationships between objects in the two images. To overcome these limitations, in this paper, we propose a new framework called TransWeaver, which learns intrinsic relationships between objects. Our TransWeaver takes image pairs as input and flexibly captures the inherent correlation between candidate objects from two images. It consists of two modules (i.e., the representation-encoder and the weave-decoder) and captures efficient context information by weaving image pairs to make them interact with each other. The representation-encoder is used for representation learning, which can obtain more discriminative representations for candidate proposals. Furthermore, the weave-decoder weaves the objects from two images and is able to explore the inter-image and intra-image context information at the same time, bringing a better object matching ability. We reorganize the PASCAL VOC, COCO, and Visual Genome datasets to obtain training and testing image pairs. Extensive experiments demonstrate the effectiveness of the proposed TransWeaver which achieves state-of-the-art performance on all datasets.

16.
Article in English | MEDLINE | ID: mdl-37022245

ABSTRACT

Regression-based multi-person pose estimation is receiving increasing attention because of its promising potential for real-time inference. However, the challenges in long-range 2D offset regression have restricted the regression accuracy, leading to a considerable performance gap compared with heatmap-based methods. This paper tackles the challenge of long-range regression by simplifying the 2D offset regression to a classification task. We present a simple yet effective method, named PolarPose, to perform 2D regression in polar coordinates. By transforming the 2D offset regression in Cartesian coordinates into quantized orientation classification and 1D length estimation in polar coordinates, PolarPose effectively simplifies the regression task, making the framework easier to optimize. Moreover, to further boost the keypoint localization accuracy in PolarPose, we propose a multi-center regression to relieve the quantization error during orientation quantization. The resulting PolarPose framework is able to regress the keypoint offsets in a more reliable way, and achieves more accurate keypoint localization. Tested with the single-model and single-scale setting, PolarPose achieves an AP of 70.2% on the COCO test-dev dataset, outperforming the state-of-the-art regression-based methods. PolarPose also achieves promising efficiency, e.g., 71.5% AP at 21.5 FPS, 68.5% AP at 24.2 FPS, and 65.5% AP at 27.2 FPS on the COCO val2017 dataset, faster than the current state of the art.
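
The core idea in the abstract above, replacing a 2D Cartesian offset with a quantized orientation class plus a 1D length, can be illustrated with a small Python sketch. This is not the PolarPose implementation: the bin count of 36, the use of bin centers for decoding, and the function names `encode_offset` and `decode_offset` are all assumptions made for illustration.

```python
import math


def encode_offset(dx, dy, num_bins=36):
    """Map a Cartesian offset to (orientation bin index, length).

    num_bins is an illustrative choice, not a value from the paper.
    """
    bin_width = 2 * math.pi / num_bins
    angle = math.atan2(dy, dx) % (2 * math.pi)  # wrap the angle into [0, 2*pi)
    # The trailing modulo guards against float rounding at the 2*pi boundary.
    bin_index = int(angle // bin_width) % num_bins
    length = math.hypot(dx, dy)
    return bin_index, length


def decode_offset(bin_index, length, num_bins=36):
    """Recover an approximate Cartesian offset from the bin center."""
    bin_width = 2 * math.pi / num_bins
    angle = (bin_index + 0.5) * bin_width
    return length * math.cos(angle), length * math.sin(angle)


# Usage: round-trip one offset and measure the quantization error.
bin_index, length = encode_offset(3.0, 4.0)
rx, ry = decode_offset(bin_index, length)
quantization_error = math.hypot(rx - 3.0, ry - 4.0)
```

Decoding from the bin center leaves a residual that grows with the offset length, which is the quantization error the abstract says multi-center regression is meant to relieve.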

17.
Sci Rep ; 13(1): 5815, 2023 04 10.
Article in English | MEDLINE | ID: mdl-37037835

ABSTRACT

The TRPM4 gene codes for a membrane ion channel subunit related to inflammation in the central nervous system. Recent investigation has identified an association between TRPM4 single nucleotide polymorphisms (SNPs) rs8104571 and rs150391806 and increased intracranial pressure (ICP) following traumatic brain injury (TBI). We assessed the influence of these genotypes on clinical outcomes and ICP in TBI patients. We included 292 trauma patients with TBI. DNA extraction and real-time PCR were used for TRPM4 rs8104571 and rs150391806 allele discrimination. Five participants were determined to have the rs8104571 homozygous variant genotype, and 20 participants were identified as heterozygotes; 24 of these 25 participants were African American. No participants had rs150391806 variant alleles, preventing further analysis of this SNP. Genotypes containing the rs8104571 variant allele were associated with decreased Glasgow outcome scale-extended (GOSE) score (P = 0.0231), which was also consistent within our African-American subpopulation (P = 0.0324). Regression analysis identified an association between rs8104571 variant homozygotes and mortality within our overall population (P = 0.0230) and among African Americans (P = 0.0244). Participants with rs8104571 variant genotypes exhibited an overall increase in ICP (P = 0.0077), although a greater frequency of ICP measurements > 25 mmHg was observed in wild-type participants (P < 0.0001). We report an association between the TRPM4 rs8104571 variant allele and poor outcomes following TBI. These findings can potentially be translated into a precision medicine approach for African Americans following TBI utilizing TRPM4-specific pharmaceutical interventions. Validation through larger cohorts is warranted.


Subjects
Brain Injuries, Traumatic; TRPM Cation Channels; Humans; Black or African American/genetics; Intracranial Pressure/physiology; Brain Injuries, Traumatic/genetics; Brain Injuries, Traumatic/complications; Genotype; Glasgow Outcome Scale; TRPM Cation Channels/genetics
18.
Article in English | MEDLINE | ID: mdl-37018296

ABSTRACT

While deep-learning-based tracking methods have achieved substantial progress, they entail large-scale and high-quality annotated data for sufficient training. To eliminate expensive and exhaustive annotation, we study self-supervised (SS) learning for visual tracking. In this work, we develop the crop-transform-paste operation, which is able to synthesize sufficient training data by simulating various appearance variations during tracking, including appearance variations of objects and background interference. Since the target state is known in all synthesized data, existing deep trackers can be trained in routine ways using the synthesized data without human annotation. The proposed target-aware data-synthesis method adapts existing tracking approaches within a SS learning framework without algorithmic changes. Thus, the proposed SS learning mechanism can be seamlessly integrated into existing tracking frameworks to perform training. Extensive experiments show that our method: 1) achieves favorable performance against supervised (Su) learning schemes under the cases with limited annotations; 2) helps deal with various tracking challenges such as object deformation, occlusion (OCC), or background clutter (BC) due to its manipulability; 3) performs favorably against the state-of-the-art unsupervised tracking methods; and 4) boosts the performance of various state-of-the-art Su learning frameworks, including SiamRPN++, DiMP, and TransT.

19.
IEEE Trans Pattern Anal Mach Intell ; 45(8): 9454-9468, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37022836

ABSTRACT

With convolution operations, Convolutional Neural Networks (CNNs) are good at extracting local features but have difficulty capturing global representations. With cascaded self-attention modules, vision transformers can capture long-distance feature dependencies but unfortunately deteriorate local feature details. In this paper, we propose a hybrid network structure, termed Conformer, to take advantage of both convolution operations and self-attention mechanisms for enhanced representation learning. Conformer is rooted in the feature coupling of CNN local features and transformer global representations under different resolutions in an interactive fashion. Conformer adopts a dual structure so that local details and global dependencies are retained to the maximum extent. We also propose a Conformer-based detector (ConformerDet), which learns to predict and refine object proposals by performing region-level feature coupling in an augmented cross-attention fashion. Experiments on ImageNet and MS COCO datasets validate Conformer's superiority for visual recognition and object detection, demonstrating its potential to be a general backbone network.


Subjects
Algorithms; Learning; Neural Networks, Computer
20.
Neural Netw ; 162: 147-161, 2023 May.
Article in English | MEDLINE | ID: mdl-36907005

ABSTRACT

Regional wind speed prediction plays an important role in the development of wind power. Wind speed is usually recorded in the form of two orthogonal components, namely U-wind and V-wind. The regional wind speed has the characteristics of diverse variations, which are reflected in three aspects: (1) The spatially diverse variations of regional wind speed indicate that wind speed has different dynamic patterns at different positions; (2) The distinct variations between U-wind and V-wind denote that U-wind and V-wind at the same position exhibit different dynamic patterns; (3) The non-stationary variations of wind speed reflect its intermittent and chaotic nature. In this paper, we propose a novel framework named Wind Dynamics Modeling Network (WDMNet) to model the diverse variations of regional wind speed and make accurate multi-step predictions. To jointly capture the spatially diverse variations and the distinct variations between U-wind and V-wind, WDMNet leverages a new neural block called Involution Gated Recurrent Unit Partial Differential Equation (Inv-GRU-PDE) as its key component. The block adopts involution to model spatially diverse variations and separately constructs hidden driven PDEs of U-wind and V-wind. The construction of PDEs in this block is achieved by a new Involution PDE (InvPDE) layer. In addition, a deep data-driven model is introduced in the Inv-GRU-PDE block as a complement to the constructed hidden PDEs for sufficiently modeling regional wind dynamics. Finally, to effectively capture the non-stationary variations of wind speed, WDMNet follows a time-variant structure for multi-step predictions. Comprehensive experiments have been conducted on two real-world datasets. Experimental results demonstrate the effectiveness and superiority of the proposed method over state-of-the-art techniques.
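
For readers unfamiliar with the U-wind and V-wind representation mentioned in the abstract above, the following Python sketch shows how a scalar speed and a direction can be recovered from the two components. It assumes the usual meteorological convention, which the abstract does not state: U is the eastward component, V is the northward component, and direction is the compass bearing the wind blows from. The function names are illustrative.

```python
import math


def wind_speed(u, v):
    """Scalar wind speed from the two orthogonal components."""
    return math.hypot(u, v)


def wind_direction_deg(u, v):
    """Compass bearing (degrees) the wind blows FROM.

    Assumes u is eastward and v is northward, so 0 means a wind from the
    north and 270 a wind from the west. This convention is an assumption.
    """
    return (180.0 + math.degrees(math.atan2(u, v))) % 360.0


# Usage: a wind blowing toward the east at 5 m/s comes from the west.
speed = wind_speed(3.0, 4.0)
westerly = wind_direction_deg(5.0, 0.0)
```

Because the two components can change sign independently while the speed stays non-negative, a model that predicts U and V separately retains direction information that a speed-only model would lose.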


Subjects
Wind